The ultimate purpose of the statistical analysis of ordinal patterns is to characterize the distribution of the features they induce. In particular, knowing the joint distribution of the entropy-statistical complexity pair for a large class of time series models would allow statistical tests that have been unavailable so far. Working in this direction, we characterize the asymptotic distribution of the empirical Shannon entropy for any model under which the true normalized entropy is neither zero nor one. We obtain the asymptotic distribution from the Central Limit Theorem (assuming long time series), the Multivariate Delta Method, and a third-order correction of its mean. We discuss the applicability of other results (exact, first-order, and second-order corrections) with regard to their accuracy and numerical stability. Within the general framework of building test statistics about the Shannon entropy, we present a bilateral test that verifies whether there is enough evidence to reject the hypothesis that two signals produce ordinal patterns with the same Shannon entropy. We apply this bilateral test to daily maximum temperature time series from three cities (Dublin, Edinburgh, and Miami) and obtain sensible results.
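A minimal sketch of the two ingredients the abstract describes, the empirical normalized Shannon entropy of ordinal patterns and a two-sided z-test built on it; the function names, the pattern order d, and the use of externally supplied standard errors (which in the paper would come from the CLT/Delta-method asymptotics, not reproduced here) are our assumptions, not the paper's code.

import numpy as np
from collections import Counter
from math import factorial
from scipy.stats import norm

def normalized_permutation_entropy(x, d=4):
    """Empirical normalized Shannon entropy of the ordinal patterns of order d."""
    patterns = [tuple(np.argsort(x[i:i + d])) for i in range(len(x) - d + 1)]
    p = np.array(list(Counter(patterns).values()), dtype=float)
    p /= p.sum()
    return float(-(p * np.log(p)).sum() / np.log(factorial(d)))

def bilateral_entropy_test(x, y, se_x, se_y, d=4):
    """Two-sided z-test of H0: both signals have the same ordinal-pattern entropy.
    se_x and se_y are standard errors of the two entropy estimates; in the paper
    these follow from the asymptotic distribution, which is not reproduced here."""
    h_x = normalized_permutation_entropy(x, d)
    h_y = normalized_permutation_entropy(y, d)
    z = (h_x - h_y) / np.sqrt(se_x ** 2 + se_y ** 2)
    return h_x, h_y, 2.0 * (1.0 - norm.cdf(abs(z)))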
Video stabilization plays a central role in improving video quality. However, despite the considerable progress made by these methods, they have mainly been tested under standard weather and lighting conditions and may perform poorly under adverse conditions. In this paper, we propose a synthetic-aware, adverse-weather-robust algorithm for video stabilization that requires no real data and can be trained solely on synthetic data. We also present Silver, a novel rendering engine that generates the required training data via an automatic ground-extraction procedure. Our approach uses this specially generated synthetic data to train an affine transformation matrix estimator, avoiding the feature-extraction issues faced by current methods. Moreover, since no video stabilization dataset exists for adverse conditions, we propose the novel VSAC105REAL dataset for evaluation. We compare our method against five state-of-the-art video stabilization algorithms using two benchmarks. Our results show that current approaches perform poorly in at least one weather condition, whereas our method, even when trained on a small, purely synthetic dataset, achieves the best performance in terms of stability score, distortion score, success rate, and average cropping ratio when all weather conditions are considered. Hence, our video stabilization model generalizes well to real-world videos and does not require large-scale synthetic training data to converge.
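The paper trains a learned affine-matrix estimator on synthetic data; the sketch below only illustrates the generic downstream step of stabilizing frames once per-frame motion estimates are available, using classical trajectory smoothing and OpenCV warping as a stand-in. All names, the (dx, dy, dangle) parameterization, and the smoothing radius are our assumptions, not the paper's pipeline.

import cv2
import numpy as np

def stabilize(frames, transforms, radius=15):
    """frames: list of images; transforms: per-frame (dx, dy, dangle) motion estimates,
    e.g. decoded from a predicted affine matrix (hypothetical estimator).
    Smooths the camera trajectory and warps each frame onto it."""
    traj = np.cumsum(transforms, axis=0)                          # cumulative camera trajectory
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    padded = np.pad(traj, ((radius, radius), (0, 0)), mode="edge")
    smoothed = np.stack([np.convolve(padded[:, k], kernel, mode="valid") for k in range(3)], axis=1)
    corrected = np.asarray(transforms) + (smoothed - traj)        # move motion onto the smooth path
    out = []
    for frame, (dx, dy, da) in zip(frames, corrected):
        h, w = frame.shape[:2]
        m = np.array([[np.cos(da), -np.sin(da), dx],
                      [np.sin(da),  np.cos(da), dy]], dtype=np.float32)
        out.append(cv2.warpAffine(frame, m, (w, h)))
    return out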
Malware is a major threat to computer systems and poses many challenges to cybersecurity. Targeted threats such as ransomware cause millions of dollars in losses every year. The constant rise in malware infections has motivated popular antiviruses (AVs) to develop dedicated detection strategies, including carefully crafted machine learning (ML) pipelines. However, malware developers continuously change their samples' features to bypass detection. This constant evolution of malware samples causes changes in the data distribution (i.e., concept drift) that directly affect ML model detection rates, something most works in the literature do not take into account. In this work, we evaluate the impact of concept drift on malware classifiers using two Android datasets: DREBIN (about 130K apps) and a subset of AndroZoo (about 350K apps). We use these datasets to train an Adaptive Random Forest (ARF) classifier as well as a Stochastic Gradient Descent (SGD) classifier. We also order all dataset samples by their VirusTotal submission timestamps and then extract features from their textual attributes using two algorithms (Word2Vec and TF-IDF). We then conduct experiments comparing the two feature extractors, the classifiers, and four drift detectors (DDM, EDDM, ADWIN, and KSWIN) to determine the best approach for a real environment. Finally, we compare possible approaches to mitigate concept drift and propose a novel data-stream pipeline that updates both the classifier and the feature extractor. To do so, we (i) conduct a longitudinal evaluation by classifying malware samples collected over nine years (2009-2018), (ii) review concept drift detection algorithms to attest their pervasiveness, (iii) compare distinct ML approaches to mitigate the problem, and (iv) propose an ML data-stream pipeline that outperforms approaches from the literature.
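A simplified stand-in for the kind of data-stream pipeline the abstract describes: a hashing text vectorizer (a streaming-friendly proxy for the paper's TF-IDF/Word2Vec features), an incrementally updated SGD classifier, and a naive windowed-error drift check in place of DDM/EDDM/ADWIN/KSWIN. All names, thresholds, and the drift rule are our assumptions, not the paper's implementation.

from collections import deque
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vec = HashingVectorizer(n_features=2 ** 18, alternate_sign=False)
clf = SGDClassifier()
classes = np.array([0, 1])                       # goodware / malware
window, errors = 500, deque(maxlen=500)

def process(texts, labels):
    # Samples are assumed to arrive ordered by submission timestamp (prequential setup).
    for text, y in zip(texts, labels):
        x = vec.transform([text])
        if hasattr(clf, "coef_"):                # test-then-train evaluation
            errors.append(int(clf.predict(x)[0] != y))
            if len(errors) == window and np.mean(errors) > 0.3:
                print("possible concept drift: refresh the classifier/feature space")
                errors.clear()
        clf.partial_fit(x, [y], classes=classes)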
机器学习(ML)方法已被证明是物理科学中非常成功的工具,特别是在应用于实验数据分析时。人工智能特别擅长在高维数据中识别模式,通常优于人类。在这里,我们应用了一个名为主成分分析(PCA)的简单ML工具,以研究来自μON光谱的数据。来自该实验的测量数量是不对称功能,其具有关于样品的平均内在磁场的信息。不对称功能的变化可能表示相变;然而,这些变化可能非常微妙,并且现有的分析方法需要了解材料的特定物理。 PCA是一个无人驾驶的ML工具,这意味着不需要对输入数据的假设,但我们发现它仍然可以成功应用于不对称曲线,并且可以恢复相位转换的指示。将该方法应用于具有不同底层物理的一系列磁性材料。我们发现,同时对所有这些材料进行PCA可以对相变指示器的清晰度产生积极影响,并且还可以改善不对称功能最重要变化的检测。对于这个联合PCA,我们介绍了一种简单的方法来跟踪不同材料的贡献以获得更有意义的分析。
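A minimal sketch of the basic PCA step the abstract relies on: stacking measured asymmetry curves (across temperatures, and across materials for the joint analysis) and projecting them onto the leading components. The function and variable names are ours; the paper's material-contribution tracking is not reproduced here.

import numpy as np
from sklearn.decomposition import PCA

def pca_scores(curves, n_components=3):
    """curves: array of shape (n_measurements, n_time_bins), one asymmetry function per row."""
    curves = np.asarray(curves, dtype=float)
    centered = curves - curves.mean(axis=0)          # PCA on mean-subtracted curves
    pca = PCA(n_components=n_components)
    scores = pca.fit_transform(centered)             # per-curve scores, e.g. plotted vs. temperature
    return scores, pca.components_, pca.explained_variance_ratio_

# A kink or jump in the leading score as a function of temperature is the kind of
# phase-transition indicator the abstract refers to.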
Context: The computational cost of fast non-LTE synthesis is one of the challenges limiting the development of 2D and 3D inversion codes. It also makes the interpretation of observations of lines formed in the chromosphere and transition region a slow and computationally expensive process, which restricts the inference of physical properties to rather small fields of view. Having fast access to the departure coefficients that quantify the deviation from LTE could largely alleviate this problem. Aims: We propose to build and train a graph network that quickly predicts the atomic level populations without solving the non-LTE problem. Methods: We find an optimal architecture for the graph network that predicts the departure coefficients of the atomic levels from the physical conditions of a model atmosphere. A suitable dataset with a representative sample of potential model atmospheres is used for training. This dataset was computed with existing non-LTE synthesis codes. Results: The graph network has been integrated into existing synthesis and inversion codes for the particular case of Ca II. We demonstrate gains of the order of magnitude in computing speed. We analyze the generalization capabilities of the graph network and show that it produces good predicted departure coefficients for unseen models. We implement this approach in HAZEL and show how the resulting inversions compare with those obtained with a standard non-LTE inversion code. Our approximate method opens up the possibility of extracting physical information from the chromosphere over large fields of view, including its time evolution. This allows us to better understand solar regions where large spatial and temporal scales are crucial.
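A toy message-passing sketch of the general idea, predicting one departure coefficient per atomic level from local atmospheric quantities; this is our own minimal architecture, written in plain PyTorch, and is not the network described in the paper.

import torch
import torch.nn as nn

class TinyGraphNet(nn.Module):
    """Nodes are atomic levels; node inputs concatenate local atmospheric quantities
    (e.g. temperature, electron density, velocity); the per-node output is assumed to
    be the log of the departure coefficient."""
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.message = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())
        self.decode = nn.Linear(hidden, 1)

    def forward(self, x, edges):
        # x: (n_levels, in_dim); edges: (n_edges, 2) index pairs for level couplings.
        h = self.encode(x)
        src, dst = edges[:, 0], edges[:, 1]
        msgs = self.message(torch.cat([h[src], h[dst]], dim=-1))
        agg = torch.zeros_like(h).index_add_(0, dst, msgs)   # sum incoming messages per node
        return self.decode(h + agg).squeeze(-1)              # predicted log departure coefficients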
Recent work in NLP has relied on deep learning, which requires large amounts of training data and computational power. This paper investigates genetic algorithms (GAs) for extractive summarization, as we hypothesize that GAs can construct more efficient solutions for the summarization task because, compared to deep learning models, they are relatively tailored to it. This is done by building a vocabulary set in which words are represented as arrays of weights, and optimizing those weight sets with a GA. The weights can be used to compute an aggregate weighting of a sentence, which can then be compared against a threshold for extraction. Our findings suggest that the GA is able to learn a weight representation that can filter out superfluous vocabulary and decide sentence importance based on common English words.
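A toy sketch of the approach the abstract outlines: each GA individual is a vector of word weights, a sentence is scored by the mean weight of its words, and a caller-supplied fitness function (e.g. ROUGE of the resulting extractive summary against a reference) drives the evolution. The operators, rates, and names are our assumptions.

import random
import numpy as np

def evolve_weights(vocab, fitness, pop_size=50, generations=100, mut_rate=0.1):
    """vocab: dict mapping word -> index; fitness: callable scoring a weight vector."""
    pop = [np.random.rand(len(vocab)) for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]   # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(vocab))                           # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            mask = np.random.rand(len(vocab)) < mut_rate                    # per-gene mutation
            child[mask] = np.random.rand(mask.sum())
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

def score_sentence(sentence, vocab, weights):
    idx = [vocab[w] for w in sentence.lower().split() if w in vocab]
    return float(np.mean(weights[idx])) if idx else 0.0                     # compare to a threshold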
Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain optimality and computational-efficiency goals, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially-distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI-modified design shows operating characteristics comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential of designs that use a GI approach to allocate participants to improve participant benefits, increase efficiency, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
Quadruped robots are currently used in industrial robotics as mechanical aids to automate several routine tasks. However, the use of such robots in domestic settings is still largely at the research stage. This paper discusses the understanding and virtual simulation of such a robot capable of detecting and understanding human emotions, generating its gait, and responding via sounds and expressions on a screen. To this end, we use a combination of reinforcement learning and software engineering concepts to simulate a quadruped robot that can understand emotions, navigate through various terrains, detect sound sources, and respond to emotions using audio-visual feedback. This paper aims to establish a framework for simulating a quadruped robot that is emotionally intelligent and can primarily respond to audio-visual stimuli using motor or audio responses. Emotion detection from speech was not as performant as ERANNs or Zeta Policy learning, but still managed an accuracy of 63.5%. The video emotion detection system produced results almost on par with the state of the art, with an accuracy of 99.66%. Due to its "on-policy" learning process, the PPO algorithm learned extremely rapidly, allowing the simulated dog to demonstrate a remarkably seamless gait across the different cadences and variations. This enabled the quadruped robot to respond to the generated stimuli, allowing us to conclude that it functions as predicted and satisfies the aim of this work.
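For reference, a minimal sketch of the clipped surrogate objective at the heart of PPO, the on-policy algorithm the abstract credits for the fast gait learning; the function name and the clipping coefficient are ours, and this is not the paper's training code.

import torch

def ppo_clipped_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    # Probability ratio between the updated policy and the behaviour policy.
    ratio = torch.exp(log_probs_new - log_probs_old)
    # Clipped surrogate objective; we return its negation so it can be minimized.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()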
Real-world robotic grasping can be done robustly if a complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to object pose and point permutation, and it generates PCDs that are geometrically consistent and properly completed. Experiments on a wide range of partial PCDs show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.
When robots learn reward functions using high-capacity models that take raw state directly as input, they need to both learn a representation for what matters in the task -- the task "features" -- as well as how to combine these features into a single objective. If they try to do both at once from input designed to teach the full reward function, it is easy to end up with a representation that contains spurious correlations in the data, which fails to generalize to new settings. Instead, our ultimate goal is to enable robots to identify and isolate the causal features that people actually care about and use when they represent states and behavior. Our idea is that we can tune into this representation by asking users what behaviors they consider similar: behaviors will be similar if the features that matter are similar, even if low-level behavior is different; conversely, behaviors will be different if even one of the features that matter differs. This, in turn, is what enables the robot to disambiguate between what needs to go into the representation versus what is spurious, as well as what aspects of behavior can be compressed together versus not. The notion of learning representations based on similarity has a nice parallel in contrastive learning, a self-supervised representation learning technique that maps visually similar data points to similar embeddings, where similarity is defined by a designer through data augmentation heuristics. By contrast, in order to learn the representations that people use, and thereby their preferences and objectives, we use their definition of similarity. In simulation as well as in a user study, we show that learning through such similarity queries leads to representations that, while far from perfect, are indeed more generalizable than self-supervised and task-input alternatives.
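A minimal sketch of how a similarity query could be turned into a training signal, using a standard contrastive (pairwise) loss over embeddings of two behaviors; the function name, margin, and label convention are our assumptions, not the paper's formulation.

import torch
import torch.nn.functional as F

def similarity_query_loss(emb_a, emb_b, same_label, margin=1.0):
    """emb_a, emb_b: embeddings of two behaviors, shape (batch, dim);
    same_label: float tensor, 1.0 if the user judged the pair similar, 0.0 otherwise."""
    dist = F.pairwise_distance(emb_a, emb_b)
    pos = same_label * dist.pow(2)                          # pull similar pairs together
    neg = (1.0 - same_label) * F.relu(margin - dist).pow(2)  # push dissimilar pairs apart
    return (pos + neg).mean()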